Estimating Statistics on Words Using Ambiguous Descriptions
In this article we propose an alternative way to prove some recent results on statistics on words, such as the expected number of runs or the expected sum of the run exponents. Our approach consists in designing a general framework, based on the symbolic method developed in analytic combinatorics. The descriptions obtained in this framework are built in such a way that the degree of ambiguity of an object O (i.e., the number of different descriptions corresponding to O) is exactly the value of the statistic under study for O. The asymptotic estimation of the expectation is then done using classical techniques from analytic combinatorics. To show the generality of our method, we not only apply it to obtain new proofs of known results but also extend them from the uniform distribution to any memoryless distribution.
On the Average Size of Glushkov's Automata
Glushkov's algorithm builds an epsilon-free nondeterministic automaton from a given regular expression. In the worst case, its number of states is linear and its number of transitions is quadratic in the size of the expression. We show in this paper that, on average, the number of transitions is linear.
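As an illustration of the construction this abstract refers to, here is a minimal Python sketch of the standard position-automaton (Glushkov) construction via nullable/first/last/follow sets. The tuple encoding of expressions and the function name `glushkov` are assumptions made for this example, not taken from the paper.

```python
from itertools import count

def glushkov(expr):
    """Return (number_of_positions, transitions) for the Glushkov automaton.

    expr is a nested tuple (hypothetical encoding for this sketch):
    ('sym', a), ('cat', l, r), ('alt', l, r) or ('star', e).
    State 0 is the initial state; states 1..n are the symbol positions.
    """
    labels = {}   # position -> letter read when entering that position
    follow = {}   # position -> set of positions that may come next
    fresh = count(1)

    def walk(e):
        """Return (nullable, first, last) for e and fill in follow."""
        kind = e[0]
        if kind == 'sym':
            p = next(fresh)
            labels[p] = e[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == 'alt':
            n1, f1, l1 = walk(e[1])
            n2, f2, l2 = walk(e[2])
            return n1 or n2, f1 | f2, l1 | l2
        if kind == 'cat':
            n1, f1, l1 = walk(e[1])
            n2, f2, l2 = walk(e[2])
            for p in l1:              # last of left is followed by first of right
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind == 'star':
            _, f, l = walk(e[1])
            for p in l:               # loop back from last to first
                follow[p] |= f
            return True, f, l
        raise ValueError(f"unknown node kind: {kind!r}")

    _, first, _ = walk(expr)
    transitions = {(0, labels[q], q) for q in first}
    transitions |= {(p, labels[q], q) for p in follow for q in follow[p]}
    return len(labels), transitions

# (a|b)*a has three symbol occurrences, hence 3 positions plus the initial state.
example = ('cat', ('star', ('alt', ('sym', 'a'), ('sym', 'b'))), ('sym', 'a'))
n_positions, transitions = glushkov(example)
```

The number of states is always one more than the number of symbol occurrences, which is why it is linear in the size of the expression; the transition count is what varies between the linear and quadratic regimes. Final states are omitted here because only transitions are being counted.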
Generic properties of subgroups of free groups and finite presentations
Asymptotic properties of finitely generated subgroups of free groups, and of
finite group presentations, can be considered in several fashions, depending on
the way these objects are represented and on the distribution assumed on these
representations: here we assume that they are represented by tuples of reduced
words (generators of a subgroup) or of cyclically reduced words (relators).
Classical models consider fixed size tuples of words (e.g. the few-generator
model) or exponential size tuples (e.g. Gromov's density model), and they
usually consider that equal length words are equally likely. We generalize both
the few-generator and the density models with probabilistic schemes that also
allow variability in the size of tuples and non-uniform distributions on words
of a given length. Our first results rely on a relatively mild prefix-heaviness
hypothesis on the distributions, which states essentially that the probability
of a word decreases exponentially fast as its length grows. Under this
hypothesis, we generalize several classical results: exponentially generically
a randomly chosen tuple is a basis of the subgroup it generates, this subgroup
is malnormal and the tuple satisfies a small cancellation property, even for
exponential size tuples. In the special case of the uniform distribution on
words of a given length, we give a phase transition theorem for the central
tree property, a combinatorial property closely linked to the fact that a tuple
freely generates a subgroup. We then further refine our results when the
distribution is specified by a Markovian scheme, and in particular we give a
phase transition theorem which generalizes the classical results on the
densities up to which a tuple of cyclically reduced words chosen uniformly at
random exponentially generically satisfies a small cancellation property, and
beyond which it presents a trivial group.
Set Systems and Families of Permutations with Small Traces
We study the maximum size of a set system on elements whose trace on any
elements has size at most . We show that if for some the
shatter function of a set system satisfies then ; this generalizes Sauer's Lemma on the size of
set systems with bounded VC-dimension. We use this bound to delineate the main
growth rates for the same problem on families of permutations, where the trace
corresponds to the inclusion for permutations. This is related to a question of
Raz on families of permutations with bounded VC-dimension that generalizes the
Stanley-Wilf conjecture on permutations with excluded patterns.
On the genericity of Whitehead minimality
We show that a finitely generated subgroup of a free group, chosen uniformly
at random, is strictly Whitehead minimal with overwhelming probability.
Whitehead minimality is one of the key elements of the solution of the orbit
problem in free groups. The proofs strongly rely on combinatorial tools,
notably those of analytic combinatorics. The result we prove actually depends
implicitly on the choice of a distribution on finitely generated subgroups, and
we establish it for the two distributions which appear in the literature on
random subgroups.
Average Analysis of Glushkov Automata under a BST-Like Model
We study the average number of transitions in Glushkov automata built from random regular expressions. This statistic depends heavily on the probability distribution placed on the expressions. Recent work shows that, under the uniform distribution, regular expressions lead to automata with a linear number of transitions. However, uniform regular expressions are not necessarily a satisfactory model. We therefore focus on another model, inspired by random binary search trees (BST), which is widely used, in particular for testing. We establish that, in this case, the average number of transitions becomes quadratic in the size of the regular expression.
Analysis of Algorithms for Permutations Biased by Their Number of Records
The topic of the article is the parametric study of the complexity of
algorithms on arrays of pairwise distinct integers. We introduce a model that
takes into account the non-uniformness of data, which we call the Ewens-like
distribution of parameter for records on permutations: the weight
of a permutation depends on its number of records. We show that
this model is meaningful for the notion of presortedness, while still being
mathematically tractable. Our results describe the expected value of several
classical permutation statistics in this model, and give the expected running
time of three algorithms: the Insertion Sort, and two variants of the Min-Max
search.
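To make the notion of records concrete, the following Python sketch counts the records (left-to-right maxima) of a permutation and computes, by brute force, the expected number of records when each permutation is weighted by its number of records. The weight form `theta ** records` and the names `count_records` and `expected_records` are assumptions for this illustration; the listing above does not preserve the exact definition.

```python
from itertools import permutations

def count_records(perm):
    """Number of left-to-right maxima (records) in perm."""
    records = 0
    best = float('-inf')
    for x in perm:
        if x > best:
            records += 1
            best = x
    return records

def expected_records(n, theta):
    """Brute-force expected number of records over permutations of 1..n.

    Assumes each permutation has weight theta ** (number of records);
    theta = 1 gives the uniform distribution. Only usable for small n,
    since it enumerates all n! permutations.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for perm in permutations(range(1, n + 1)):
        r = count_records(perm)
        w = theta ** r
        total_weight += w
        weighted_sum += r * w
    return weighted_sum / total_weight
```

With `theta = 1` this reduces to the uniform case, where the expected number of records of a permutation of size n is the harmonic number H_n. Under this assumed weighting, values of `theta` above 1 favour permutations with many records, which look closer to sorted input.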
Seed: an easy to use random generator of recursive data structures for testing
Random testing represents a simple and tractable way for software assessment. This paper presents the Seed tool, which can be used for the uniform random generation of recursive data structures such as labelled trees and logical formulas. We show how Seed can be used in several testing contexts, from model-based testing to performance testing. Generated data structures are defined by grammar-like rules, given in an XML format, which broadens Seed's possible applications. Seed is based on combinatorial techniques, and can generate uniformly at random k structures of size n with a time complexity in O(n^2 + kn log n). Finally, Seed is available as a free Java application, and a great effort has been made to make it easy to use.
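The abstract does not say which combinatorial technique Seed uses. As one possible illustration of uniform random generation of a recursive structure, here is a Python sketch of the classical counting-based ("recursive") method for binary trees. It is not Seed's implementation or input format; the names `count_trees`, `random_tree` and `size` are hypothetical.

```python
import random
from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees(n):
    """Number of binary trees with n internal nodes (the Catalan numbers)."""
    if n == 0:
        return 1
    return sum(count_trees(k) * count_trees(n - 1 - k) for k in range(n))

def random_tree(n, rng=random):
    """Draw a binary tree with n internal nodes uniformly at random.

    A tree is None (leaf) or a pair (left, right). The size k of the left
    subtree is chosen with probability
    count_trees(k) * count_trees(n - 1 - k) / count_trees(n).
    """
    if n == 0:
        return None
    r = rng.randrange(count_trees(n))
    for k in range(n):
        block = count_trees(k) * count_trees(n - 1 - k)
        if r < block:
            return (random_tree(k, rng), random_tree(n - 1 - k, rng))
        r -= block

def size(tree):
    """Number of internal nodes of a tree built by random_tree."""
    return 0 if tree is None else 1 + size(tree[0]) + size(tree[1])

# Usage: a reproducible sample of size 10.
sample = random_tree(10, random.Random(1))
```

Precomputing the counts once and then drawing many structures is the general shape behind a cost that splits into a preprocessing term and a per-structure term, although this sketch makes no attempt to match the complexity stated in the abstract.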